5 research outputs found

    Testing the Accuracy of Query Optimizers

    Get PDF
    ABSTRACT The accuracy of a query optimizer is intricately connected with a database system performance and its operational cost: the more accurate the optimizer's cost model, the better the resulting execution plans. Database application programmers and other practitioners have long provided anecdotal evidence that database systems differ widely with respect to the quality of their optimizers, yet, to date no formal method is available to database users to assess or refute such claims. In this paper, we develop a framework to quantify an optimizer's accuracy for a given workload. We make use of the fact that optimizers expose switches or hints that let users influence the plan choice and generate plans other than the default plan. Using these implements, we force the generation of multiple alternative plans for each test case, time the execution of all alternatives and rank the plans by their effective costs. We compare this ranking with the ranking of the estimated cost and compute a score for the accuracy of the optimizer. We present initial results of an anonymized comparisons for several major commercial database systems demonstrating that there are in fact substantial differences between systems. We also suggest ways to incorporate this knowledge into the commercial development process

    Memory Aware Query Scheduling in a Database Cluster

    No full text
    Query throughput is one of the primary optimization goals in interactive web-based information systems in order to achieve the performance necessary to serve large user communities. Queries in this application domain dier signicantly from those in traditional database applications: they are of lower complexity and almost exclusively read-only. The architecture we propose here is specically tailored to take advantage of the query characteristics. It is based on a large parallel shared-nothing database cluster where each node runs a separate server with a fully replicated copy of the database. A query is assigned and entirely executed on one single node avoiding network contention or synchronization eects. However, the actual key to enhanced throughput is a resource ecient scheduling of the arriving queries. We develop a simple and robust scheduling scheme that takes the currently memory resident data at each server into account and trades o memory re-use and execution time,..

    ABSTRACT On Optimal Pipeline Processing in Parallel Query Execution

    No full text
    and their applications. SMC is sponsored by the Netherlands Organization for Scientific Research (NWO). CWI is a member o
    corecore